-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support default checksums #1191
Conversation
Signed-off-by: 0marperez <[email protected]>
…flexible-checksums
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
Note to reviewers: I'll address the breaking API changes and failing protocol issues soon. In the meantime, I’d appreciate a review of the checksum-related changes. |
* Calculates a request's checksum. | ||
* | ||
* If the checksum will be sent as a header, calculate the checksum. | ||
* If a user supplies a checksum via an HTTP header no calculation will be done. The exception is MD5, if a user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness/clarification: "Calculates a request's checksum" and "If a user supplies a checksum via an HTTP header no calculation will be done." conflict with each other
* | ||
* @param requestChecksumRequired Model sourced flag indicating if checksum calculation is mandatory. | ||
* @param requestChecksumCalculation Configuration option that determines when checksum calculation should be done. | ||
* @param userSelectedChecksumAlgorithm The checksum algorithm that the user selected for the request, may be null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming: requestChecksumAlgorithm
private val forcedToCalculateChecksum = requestChecksumRequired || requestChecksumCalculation == HttpChecksumConfigOption.WHEN_SUPPORTED | ||
private val checksumHeader = StringBuilder("x-amz-checksum-") | ||
private val defaultChecksumAlgorithm = lazy { Crc32() } | ||
private val defaultChecksumAlgorithmHeaderPostfix = "crc32" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postfix -> Suffix
* The header must start with "x-amz-checksum-" followed by the checksum algorithm's name. | ||
* MD5 is not considered a valid checksum algorithm. | ||
*/ | ||
private fun userProviderChecksumHeader(request: HttpRequest, logger: Logger): String? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: can be restructured as an extension function HttpRequest.checksumHeader(logger: Logger): String?
and naming: userProvidedChecksumHeader
* Users can check which checksum was validated by referencing the `ResponseChecksumValidated` execution context variable. | ||
* | ||
* @param shouldValidateResponseChecksumInitializer A function which uses the input [I] to return whether response checksum validation should occur | ||
* @param responseValidationRequired Flag indicating if the checksum validation is mandatory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs: "Model sourced flag"
.removePrefix("x-amz-checksum-") | ||
.toHashFunction() ?: throw ClientException("Could not parse checksum algorithm from header $checksumHeader") | ||
|
||
if (context.protocolResponse.body is HttpBody.Bytes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this branch added? The spec says we should delay checksum calculation until the body is consumed:
Where possible, SDKs MUST defer this calculation and validation until the payload is actually consumed by the user i.e. a payload must not be read twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toHashingBody
doesn't support Bytes
bodies. I decided to calculate the checksum in memory instead of adding support for Bytes
to toHashingBody
. It should be safe if the request is not being streamed but I think you're right that we're not following the spec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toHashingBody
doesn't supportBytes
bodies
Does it need to? The previous implementation never seemed to need it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does, the spec has some unit tests with non-streaming bodies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline and we'll be removing support for HttpBody.Bytes
from FlexibleChecksumsResponseInterceptor
because HTTP bodies are never bytes. Source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removing support for HttpBody.Bytes
is a bad idea. Our current implementations of OkHttp and CRT engines do not return bytes, that's true, but future implementations or other engines wrapped by users may. In addition, interceptors may be used to rewrite response bodies by streaming them into memory, processing them, and then rewriting the body in a new form (possibly HttpBody.Bytes
). I think we need to ensure each type of HttpBody
is handled correctly.
@@ -126,23 +133,6 @@ class FlexibleChecksumsRequestInterceptorTest { | |||
assertEquals(0, call.request.headers.getNumChecksumHeaders()) | |||
} | |||
|
|||
@Test | |||
fun itSetsChecksumHeaderViaExecutionContext() = runTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're no longer setting the checksum header via execution context
@@ -163,29 +168,4 @@ class FlexibleChecksumsResponseInterceptorTest { | |||
|
|||
op.roundTrip(client, TestInput("input")) | |||
} | |||
|
|||
@Test | |||
fun testSkipsValidationWhenDisabled() = runTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be covered under ChecksumConfigTest
> ResponseChecksumValidation
but I added it back
|
||
public enum class HttpChecksumConfigOption { | ||
/** | ||
* SDK will create/validate checksum if the service marks it as required or if this is set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs: create -> calculate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: missing tests for when requestChecksumRequired = false
and requestChecksumCalculation = HttpChecksumConfigOption.WHEN_REQUIRED
. Same for response validation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request test cases are already here. I think the response validation unit tests need one more to be exhaustive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those tests in ChecksumConfigTest.kt aren't validating interceptor behavior like the tests in this file do.
* @param checksumAlgorithmNameInitializer an optional function which parses the input [I] to return the checksum algorithm name. | ||
* if not set, then the [HttpOperationContext.ChecksumAlgorithm] execution context attribute will be used. | ||
* If the request will be streamed: | ||
* - The checksum calculation is done asynchronously using a hashing & completing body. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "asynchronously" → "during transmission"
* - The checksum will be sent in a trailing header, once the request is consumed. | ||
* | ||
* If the request will not be streamed: | ||
* - The checksum calculation is done synchronously |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "synchronously" → "before transmission"
private val defaultChecksumAlgorithm = lazy { Crc32() } | ||
private val defaultChecksumAlgorithmHeaderPostfix = "crc32" | ||
|
||
private val checksumAlgorithm = userSelectedChecksumAlgorithm?.let { | ||
val hashFunction = userSelectedChecksumAlgorithm.toHashFunction() | ||
if (hashFunction == null || !hashFunction.isSupported) { | ||
throw ClientException("Checksum algorithm '$userSelectedChecksumAlgorithm' is not supported for flexible checksums") | ||
} | ||
checksumHeader.append(userSelectedChecksumAlgorithm.lowercase()) | ||
hashFunction | ||
} ?: if (forcedToCalculateChecksum) { | ||
checksumHeader.append(defaultChecksumAlgorithmHeaderPostfix) | ||
defaultChecksumAlgorithm.value | ||
} else { | ||
null | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Seems unnecessary to declare forcedToCalculateChecksum
, defaultChecksumAlgorithm
, and defaultChecksumAlgorithmHeaderPostfix
as fields only to use them in one if
branch. Could this just be:
?: if (requestChecksumRequired || requestChecksumCalculation == HttpChecksumConfigOption.WHEN_SUPPORTED) {
checksumHeader.append("crc32")
Crc32()
} else
userProviderChecksumHeader(context.protocolRequest, logger)?.let { | ||
logger.debug { "User supplied a checksum via header, skipping checksum calculation" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: These log messages which talk about the user feel awkward. Log messages are for users so it seems strange to mention the user in the third person. I'd suggest rewording these to be more about data/actions and less about the actors involved (e.g., "Found a provided checksum in the request, skipping checksum calculation").
|
||
deferredChecksum.complete(checksum) | ||
} else { | ||
logger.debug { "Calculating checksum asynchronously" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We've lost the log message that we're calculating a checksum asynchronously during transmission. I'd suggest moving your "Calculating checksum using '$checksumAlgorithm'"
message into the if
/else
clauses, tweaking each to mention whether the checksum is being calculated before or during transmission.
private fun String.isCompositeChecksum(): Boolean { | ||
// Ends with "-#" where "#" is a number between 1-1000 | ||
val regex = Regex("-([1-9][0-9]{0,2}|1000)$") | ||
return regex.containsMatchIn(this) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: When this logic is made S3 specific, we shouldn't assert on the part number fitting inside a certain range. Just "-(\d)+$"
should suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please add tests for HttpChecksumConfigOption.WHEN_REQUIRED
. Applies to response interceptors as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request test cases are already here. I think the response validation unit tests need one more to be exhaustive.
@Test | ||
fun itSetsChecksumHeaderViaExecutionContext() = runTest { | ||
checksums.forEach { (checksumAlgorithmName, expectedChecksumValue) -> | ||
val req = HttpRequestBuilder().apply { | ||
body = HttpBody.fromBytes("<Foo>bar</Foo>".encodeToByteArray()) | ||
} | ||
|
||
val op = newTestOperation<Unit, Unit>(req, Unit) | ||
op.context[HttpOperationContext.ChecksumAlgorithm] = checksumAlgorithmName | ||
op.interceptors.add(FlexibleChecksumsRequestInterceptor<Unit>()) | ||
|
||
op.roundTrip(client, Unit) | ||
val call = op.context.attributes[HttpOperationContext.HttpCallList].first() | ||
assertEquals(expectedChecksumValue, call.request.headers["x-amz-checksum-$checksumAlgorithmName"]) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Do we still use HttpOperationContext.ChecksumAlgorithm
or can it be deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's still used for MD5 checksums in the httpChecksumRequiredTrait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update on this, it's no longer used anywhere so we can deprecate/remove it
@Test | ||
fun testSkipsValidationWhenDisabled() = runTest { | ||
val req = HttpRequestBuilder() | ||
val op = newTestOperation<TestInput>(req) | ||
|
||
op.interceptors.add( | ||
FlexibleChecksumsResponseInterceptor<TestInput> { | ||
false | ||
}, | ||
) | ||
|
||
val responseChecksumHeaderName = "x-amz-checksum-crc32" | ||
|
||
val responseHeaders = Headers { | ||
append(responseChecksumHeaderName, "incorrect-checksum-would-throw-if-validated") | ||
} | ||
|
||
val client = getMockClient(response, responseHeaders) | ||
|
||
val output = op.roundTrip(client, TestInput("input")) | ||
output.body.readAll() | ||
|
||
assertNull(op.context.getOrNull(ChecksumHeaderValidated)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why did we delete this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be covered under ChecksumConfigTest > ResponseChecksumValidation but I added it back
/** | ||
* SDK will create/validate checksum if the service marks it as required or if this is set. | ||
*/ | ||
WHEN_SUPPORTED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "or if this is set" → "or if the service offers optional checksums"
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…flexible-checksums
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
// FIXME: Re-enable. This test is broken after a smithy update: https://github.com/smithy-lang/smithy/pull/2467 | ||
// ProtocolTest("aws-json-10", "aws.protocoltests.json10#JsonRpc10"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I know we talked about disabling some tests that are failing in Smithy 1.53.0, but these are disabling the entire test suite, which we don't want to do
@@ -32,7 +33,7 @@ import software.amazon.smithy.rulesengine.traits.EndpointTestsTrait | |||
* shape's closure for example) | |||
*/ | |||
@Suppress("EXTENSION_SHADOWED_BY_MEMBER") | |||
inline fun <reified T : Shape> Model.shapes(): List<T> = shapes(T::class.java).toList() | |||
inline fun <reified T : Shape> Model.shapes(): List<T> = shapes(T::class.java).kotlinToList() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has never been a problem before, what changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a refactor in AWS SDK Kotlin to make the codegen tests KMP, which led to the JVM CI checks failing because we've been using JVM 17. I set the JVM version in the tests to 1.8 and the toList
function used above is only available since JVM 16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that makes sense. You should be able to use shapes(T::class.java).collect(Collectors.toList())
instead of hacking around with the Kotlin toList
import software.amazon.smithy.model.traits.HttpChecksumRequiredTrait | ||
|
||
/** | ||
* Handles the `httpChecksumRequired` trait. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit/docs: Explain more about how it "handles" the trait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "meat" of the integration is the middleware and both of them have their own Kdocs. It feels like docs overkill to also add a summary of what each middleware is doing here
writer.write( | ||
"op.context[#T.DefaultChecksumAlgorithm] = #S", | ||
RuntimeTypes.HttpClient.Operation.HttpOperationContext, | ||
"MD5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question/correctness: Can/should we store this as a HashFunction
rather than a String
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, I think we would have to initialize the HashFunction
in the context and its seems slightly worse than what we have right now. Thoughts?
|
||
/** | ||
* @return The default checksum algorithm name, null if default checksums are disabled. | ||
*/ | ||
internal fun defaultChecksumAlgorithmName(context: ProtocolRequestInterceptorContext<Any, HttpRequest>): String? = | ||
context.executionContext.getOrNull(HttpOperationContext.DefaultChecksumAlgorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: This seems like the wrong place to define this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have said HttpChecksumRequiredInterceptor
but I see it's also used by FlexibleChecksumsRequestInterceptor
. I think it's fine here.
style: make it an extension val
internal val ProtocolRequestInterceptorContext<Any, HttpRequest>.defaultChecksumAlgorithmName: String?
get() = executionContext.getOrNull(HttpOperationContext.DefaultChecksumAlgorithm)
/** | ||
* Convert an [HttpBody] with an underlying [HashingSource] or [HashingByteReadChannel] | ||
* to a [CompletingSource] or [CompletingByteReadChannel], respectively. | ||
*/ | ||
@InternalApi | ||
public fun HttpBody.toCompletingBody(deferred: CompletableDeferred<String>): HttpBody = when (this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These used to be private
, now they are @InternalApi public
, meaning they are included in our API dumps and part of backwards compatibility. Was this move necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I moved this here because I was going to add support for streaming checksums in the HttpCheksumsRequired
interceptor
*/ | ||
@InternalApi | ||
public val HashFunction.isSupportedForFlexibleChecksums: Boolean get() = | ||
algorithmsSupportedForFlexibleChecksums.contains(this::class.simpleName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: Relying on this::class.simpleName
seems unsafe.
Better solution:
@InternalApi
public val HashFunction.isSupportedForFlexibleChecksums: Boolean
get() = when (this) {
is Crc32, is Crc32c, is Sha1, is Sha256 -> true
else -> false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agree, that's what we had before but we would have to keep track of what checksum algorithms are supported for flexible checksums in two places. In HashFunction.isSupportedForFlexibleChecksums
& algorithmsSupportedForFlexibleChecksums
.
I made a tradeoff here, it seems more risky to possibly miss updating what algorithms we support in both locations than relying on the class name. Plus we have unit tests that should lessen some of our concerns here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only other use of algorithmsSupportedForFlexibleChecksums
is in a log message. I'd rather remove the log message than keep this class reflection
@InternalApi | ||
public fun runBlocking(block: suspend () -> Unit) { | ||
runBlocking { | ||
block() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: What is this used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's so we can have runBlocking
as a runtime type, it's used in the presigner generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just add a new runBlocking
type under the existing KotlinxCoroutines
runtime type
@@ -0,0 +1,40 @@ | |||
package aws.smithy.kotlin.runtime.client.config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: missing license at the top. same applies to Concurrency.kt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think we need to adjust our Ktlint for be more stringent when it comes to license headers
} | ||
} | ||
|
||
public enum class HttpChecksumConfigOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs: missing KDocs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can add a new rule to Ktlint about Kdocs? Not sure it that's possible but I think I'd be helpful
} | ||
} | ||
|
||
public enum class HttpChecksumConfigOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: Right now request and response checksum configs both use this enum. In the future, their config options might diverge, which will force us to make an API break. I'd recommend splitting this into HttpChecksumRequestConfigOption
and HttpChecksumResponseConfigOption
now to get ahead of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like just speculation for now, I say we only separate them if that time comes. Changing the type of this config option would be a breaking change for all SDKs. I think it's more likely that we would make additive changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not speculative, there was some discussion during the spec review about adding more options to either request or response but not both.
The spec calls out RequestChecksumCalculation
and ResponseChecksumValidation
as separate config types, so adding a new config option to either of those wouldn't be a breaking change for SDKs which implement it as the spec says.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
/** | ||
* Handles checksum calculation so that checksums will be cached during retry loop | ||
*/ | ||
@InternalApi | ||
public abstract class AbstractChecksumInterceptor : HttpInterceptor { | ||
private var cachedChecksum: String? = null | ||
|
||
override suspend fun modifyBeforeSigning(context: ProtocolRequestInterceptorContext<Any, HttpRequest>): HttpRequest { | ||
cachedChecksum ?: calculateChecksum(context).also { cachedChecksum = it } | ||
return cachedChecksum?.let { applyChecksum(context, it) } ?: context.protocolRequest | ||
cachedChecksum = cachedChecksum ?: calculateChecksum(context) | ||
|
||
return if (cachedChecksum != null) { | ||
applyChecksum(context, cachedChecksum!!) | ||
} else { | ||
context.protocolRequest | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: "checksums will be cached during retry loop"...is that what we want? One of the things which could occur between a failed first attempt and a subsequent retry is the body changing because, say, it's a file stream and the underlying file has been modified. In that case, we'd send the old checksum and the new file body?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was originally added for performance reasons. There's a discussion about it here. In your example someone/something would have to modify the underlying file in a file stream between retries. I'm not sure how often that happens for our customers but I'm open to getting rid of it.
} | ||
|
||
logger.debug { "Checksum wasn't provided, selected, or isn't required: skipping checksum calculation" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correctness: This message still fires in the case we're doing the chunked streaming hashing but it shouldn't.
/** | ||
* Determines what checksum algorithm to use, null if none is required | ||
*/ | ||
private fun resolveChecksumAlgorithm( | ||
requestChecksumRequired: Boolean, | ||
requestChecksumCalculation: RequestHttpChecksumConfig?, | ||
requestChecksumAlgorithm: String?, | ||
context: ProtocolRequestInterceptorContext<Any, HttpRequest>, | ||
): HashFunction? = | ||
requestChecksumAlgorithm | ||
?.toHashFunctionOrThrow() | ||
?.takeIf { it.isSupportedForFlexibleChecksums } | ||
?: context.defaultChecksumAlgorithmName | ||
?.toHashFunctionOrThrow() | ||
?.takeIf { | ||
(requestChecksumRequired || requestChecksumCalculation == RequestHttpChecksumConfig.WHEN_SUPPORTED) && | ||
it.isSupportedForFlexibleChecksums | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment: Nice fluid/functional style! 🤩
else -> { | ||
val bodyBytes = req.body.readAll()!! | ||
req.body = bodyBytes.toHttpBody() | ||
val bodyBytes = req.body.readAll() ?: byteArrayOf() | ||
if (req.body.isOneShot) req.body = bodyBytes.toHttpBody() | ||
bodyBytes.hash(checksumAlgorithm).encodeBase64String() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: This is an interesting change. We're now replacing the request body only if it's a oneshot body. Replayable (multi-shot?) bodies don't get replaced and so we'll read them again later when transmitting the payload.
In theory that could be slightly more performant if the backing channel/source was already in memory because now we'd no longer be duplicating it. But I think it could also be slightly less performant if the backing channel/source has to be read from IO (disk, network, etc.) since we'd be doing that IO twice.
What led to this change? Was replayable stream performance a factor or was it something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seemed like we could improve performance a bit here but I hadn't considered what you said. I don't have a strong inclination towards keeping the optimization fwiw
override suspend fun modifyBeforeDeserialization(context: ProtocolResponseInterceptorContext<Any, HttpRequest, HttpResponse>): HttpResponse { | ||
if (!shouldValidateResponseChecksum) { | ||
return context.protocolResponse | ||
} | ||
val logger = coroutineContext.logger<FlexibleChecksumsResponseInterceptor>() | ||
|
||
val logger = coroutineContext.logger<FlexibleChecksumsResponseInterceptor<I>>() | ||
val configuredToVerifyChecksum = responseValidationRequired || responseChecksumValidation == ResponseHttpChecksumConfig.WHEN_SUPPORTED | ||
if (!configuredToVerifyChecksum) return context.protocolResponse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We don't need logger
in the case of !configuredToVerifyChecksum
and we exit early. We could move the initialization of it closer to the use site.
if (ignoreChecksum(serviceChecksumValue)) { | ||
logger.info { "Checksum detected but validation was skipped." } | ||
return context.protocolResponse | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Putting the logging statement here means we can't provide more information about why validation was skipped. I think we should log in the subclass (IgnoreCompositeFlexibleChecksumResponseInterceptor
) where we can definitely state the reason.
return when (val body = context.protocolRequest.body) { | ||
is HttpBody.Bytes -> { | ||
checksumAlgorithm.update( | ||
body.readAll() ?: byteArrayOf(), | ||
) | ||
checksumAlgorithm.digest().encodeBase64String() | ||
} | ||
else -> null // TODO: Support other body types | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why the TODO
here? Can't we just use HashingSource
and HashingByteReadChannel
to implement the other body types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can use those types. A small refactor is needed and I think deadlines shifting left me uncertain if I could get this functionality supported on time. I think we might as well do it now since the refactor will be a breaking change
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good. Some minor feedback/suggestions...fix (or don't) and ship!
/** | ||
* Applies a checksum based on the requirements and limitations of [FlexibleChecksumsRequestInterceptor] | ||
*/ | ||
private fun applyFlexibleChecksumsChecksum( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: The names applyFlexibleChecksumsChecksum
and calculateFlexibleChecksumsChecksum
are pretty unwieldy. Given that these method are already in a class with "flexible checksums" in the name, it seems like we could just stick with the original names applyChecksum
and calculateChecksum
. Was there a reason to extract private helper methods with longer names here?
(Same question in HttpChecksumRequiredInterceptor.kt)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's so we can share logic for caching and non-caching checksum calculation. I gave the helper functions longer names to avoid naming conflicts with the superclass functions called applyChecksum
and calculateChecksum
.
public open fun ignoreChecksum(checksum: String): Boolean = false | ||
public open fun ignoreChecksum(checksum: String, logger: Logger): Boolean = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Instead of passing the logger
, we could pass context: ProtocolResponseInterceptorContext<Any, HttpRequest, HttpResponse>
. The full context which contains more information and may be useful to future subclasses which need more data to make a determination about ignoring a checksum. The logger would then be available to implementors as context.executionContext.coroutineContext.logger()
(or ...coroutineContext.warn { ... }
, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion, let me do this before I merge
req.body.contentLength == null && !req.body.isOneShot -> { | ||
val channel = req.body.toSdkByteReadChannel()!! | ||
channel.rollingHash(checksumAlgorithm).encodeBase64String() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why does it matter if req.body.contentLength == null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming it's a requirement for toSdkByteReadChannel
and rollingHash
. This code is originally from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this was discussed in #772 (comment). That helps add some context.
Affected ArtifactsSignificantly increased in size
Changed in size
|
Other approval available and discussed offline this is ok to merge
Issue #
Description of changes
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.